Japanese broadcast news transcription
Abstract
In this paper, we describe the ongoing development of a Japanese Broadcast News Transcription system at BBN Technologies. This is a collaboration between BBN and NHK to use automatic speech recognition technology to provide live closed captions for NHK’s TV news programs in Japan. We describe what the NHK Broadcast News Corpus comprises and how we adopted transcription technology developed for the Hub-4 English broadcast news task to achieve an overall word error rate (WER) of less than 5% for Japanese TV news programs. We also report on how we obtained a 30-50% relative WER reduction for weather forecasts and sports news through the use of micro-domain lexicons and language models.
Similar resources
Toward Automatic Recognition of Japanese Broadcast News
In this paper we report on automatic recognition of Japanese broadcast-news speech. We have been working on large-vocabulary continuous speech recognition (LVCSR) for Japanese newspaper speech transcription and achieved reasonably good performance. We have recently applied our LVCSR system to transcribing Japanese broadcast-news speech. We extended the vocabulary to 20k words and trained the lan...
Toward automatic transcription of Japanese broadcast news
In this paper, we report on the automatic recognition of Japanese broadcast-news speech. We have been working on large-vocabulary continuous speech recognition (LVCSR) for Japanese newspaper speech transcription and have achieved good performance. We have recently applied our LVCSR system to transcribing Japanese broadcast-news speech. We extended the vocabulary from 7k words to 20k words and tr...
Japanese Broadcast News Transcription and Topic Detection
This paper reports recent advances in Japanese broadcast news transcription and automatic topic detection from the transcribed news speech. To cope with the variability of the readings for each word, a new method for incorporating reading probability of each word in the decoding process is proposed. As a realistic solution to the new-word problem, a new method is proposed, in which new words ar...
Recent advances in Japanese broadcast news transcription
In this paper, we report on language modeling and acoustic modeling studies for Japanese broadcast news speech recognition. We constructed a language model that reduces recognition errors by utilizing context-dependent readings of Japanese characters. We also introduced filled-pause modeling into the language model. To improve the model’s performance for a series of sentences spoken by one spea...
Improvements in Japanese Broadcast News Transcription
This paper reports on recent improvements in Japanese broadcast news transcription and topic extraction. We constructed a language model that depends on the readings of words in order to prevent recognition errors caused by context-dependent readings of Japanese characters. We also introduced interjection modeling into the language model. To improve the model’s performance for a series of sente...